The chi-squared test is a statistical test used to compare observed data with the data we would expect to obtain according to a specific hypothesis. The chi-squared test always tests the null hypothesis. Our null hypothesis is that there is no difference in the underlying frequencies of jelly baby colours in the two habitats. The p value is not the chance that the null hypothesis is true or false; it is the probability that we would get a result this different, or more different, if the null hypothesis is true. A low p value does not prove that the null hypothesis is false or that the difference is not due to sampling error.
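For reference, the statistic and the p value described above can be written as follows. The symbols are chosen here for illustration: O_i and E_i are the observed and expected counts in cell i, and \nu is the number of degrees of freedom.

```latex
\chi^2_{\mathrm{obs}} = \sum_i \frac{(O_i - E_i)^2}{E_i},
\qquad
p = P\left(\chi^2_{\nu} \ge \chi^2_{\mathrm{obs}} \mid H_0\right)
```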
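| Population | Black | Green | Orange | Red | Yellow | Total (rows) |
|---|---|---|---|---|---|---|
| 1 | 2 | 8 | 7 | 2 | 1 | 20 |
| 2 | 1 | 10 | 4 | 2 | 3 | 20 |
| 3 | 1 | 9 | 0 | 9 | 1 | 20 |
| 4 | 1 | 9 | 0 | 9 | 1 | 20 |
| 5 | 4 | 1 | 6 | 6 | 3 | 20 |
| 6 | 5 | 8 | 2 | 2 | 3 | 20 |
| Total (columns) | 14 | 45 | 19 | 30 | 12 | 120 |

Expected values
Black: 14/6 = 2.3
Green: 45/6 = 7.5
Orange: 19/6 = 3.2
Red: 30/6 = 5
Yellow: 12/6 = 2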
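The expected values can be recomputed from the raw counts. The following is a minimal sketch, not part of the original analysis; the names `OBSERVED`, `column_totals` and `expected_per_population` are illustrative. It relies on every population containing the same number of jelly babies, so the expected count is the column total divided by the number of populations.

```python
# Observed counts per population, in the order
# Black, Green, Orange, Red, Yellow (copied from the table of results).
COLOURS = ["Black", "Green", "Orange", "Red", "Yellow"]
OBSERVED = [
    [2, 8, 7, 2, 1],
    [1, 10, 4, 2, 3],
    [1, 9, 0, 9, 1],
    [1, 9, 0, 9, 1],
    [4, 1, 6, 6, 3],
    [5, 8, 2, 2, 3],
]


def column_totals(rows):
    """Sum each colour's count over all populations."""
    return [sum(column) for column in zip(*rows)]


def expected_per_population(rows):
    """Expected count of each colour in a single population.

    Assumption: all populations are the same size, so the expected
    count is simply (column total) / (number of populations).
    """
    return [total / len(rows) for total in column_totals(rows)]


for colour, value in zip(COLOURS, expected_per_population(OBSERVED)):
    print(f"{colour}: {value:.1f}")
```

If the populations had different sizes, each expected count would instead be (row total × column total) / grand total.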
We combine the black and yellow jelly babies into a single category because the expected value is less than 3 for each of them.
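The merging step can be sketched in code. This is an illustration under stated assumptions: the threshold of 3 is the one used in this report, and the helper names (`low_expected_colours`, `merge_black_yellow`) are hypothetical.

```python
# Counts per population in the order Black, Green, Orange, Red, Yellow.
COLOURS = ["Black", "Green", "Orange", "Red", "Yellow"]
OBSERVED = [
    [2, 8, 7, 2, 1],
    [1, 10, 4, 2, 3],
    [1, 9, 0, 9, 1],
    [1, 9, 0, 9, 1],
    [4, 1, 6, 6, 3],
    [5, 8, 2, 2, 3],
]
MIN_EXPECTED = 3  # threshold used in this report


def low_expected_colours(rows, colours, minimum):
    """Colours whose expected count per population falls below `minimum`."""
    expected = [sum(column) / len(rows) for column in zip(*rows)]
    return [c for c, e in zip(colours, expected) if e < minimum]


def merge_black_yellow(rows):
    """Return rows ordered as Black and Yellow, Green, Orange, Red."""
    return [
        [black + yellow, green, orange, red]
        for black, green, orange, red, yellow in rows
    ]


print(low_expected_colours(OBSERVED, COLOURS, MIN_EXPECTED))
for row in merge_black_yellow(OBSERVED):
    print(row)
```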
| Population | Black and Yellow | Green | Orange | Red | Total (rows) |
|---|---|---|---|---|---|
| 1 | 3 | 8 | 7 | 2 | 20 |
| 2 | 4 | 10 | 4 | 2 | 20 |
| 3 | 2 | 9 | 0 | 9 | 20 |
| 4 | 2 | 9 | 0 | 9 | 20 |
| 5 | 7 | 1 | 6 | 6 | 20 |
| 6 | 8 | 8 | 2 | 2 | 20 |
| Total (columns) | 26 | 45 | 19 | 30 | 120 |

Expected values
Black and Yellow: 26/6 = 4.3
Green: 45/6 = 7.5
Orange: 19/6 = 3.2
Red: 30/6 = 5

Independent replication – Bush habitat
| Population | Black and Yellow | Green | Orange | Red | Total (rows) |
|---|---|---|---|---|---|
| 1 | 3 | 8 | 7 | 2 | 20 |
| 2 | 4 | 10 | 4 | 2 | 20 |
| 3 | 2 | 9 | 0 | 9 | 20 |
| Total (columns) | 9 | 27 | 11 | 13 | 60 |

Expected values
Black and Yellow: 9/3 = 3
Green: 27/3 = 9
Orange: 11/3 = 3.67
Red: 13/3 = 4.33

Black and Yellow
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | 0 | 0 | 0 |
| 2 | 1 | 1 | 0.33 |
| 3 | -1 | 1 | 0.33 |

Green
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | -1 | 1 | 0.11 |
| 2 | 1 | 1 | 0.11 |
| 3 | 0 | 0 | 0 |

Orange
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | 3.33 | 11.11 | 3.03 |
| 2 | 0.33 | 0.11 | 0.03 |
| 3 | -3.67 | 13.44 | 3.67 |

Red
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | -2.33 | 5.44 | 1.26 |
| 2 | -2.33 | 5.44 | 1.26 |
| 3 | 4.67 | 21.78 | 5.03 |

Chi-squared value = 15.2
P value ≈ 0.02
Critical value = 16.81 (significance level 0.01, 6 degrees of freedom)
The chi-squared value (15.2) is less than the critical value (16.81). If the null hypothesis is true, a chi-squared value of 16.81 or less is expected 99% of the time. To reject the null hypothesis at this level, the chi-squared value would have to be greater than the critical value; it is not, so we fail to reject the null hypothesis. Failing to reject does not show that the null hypothesis is true. It means only that these data do not give strong enough evidence of a difference in the underlying frequencies of jelly babies in the bush habitat.
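The bush calculation can be checked by recomputing the statistic directly from the observed counts. The sketch below is illustrative rather than the method used in the report: the function names are hypothetical, and it assumes the data are treated as a 3 × 4 table whose rows all have the same total. The critical value is the one quoted above.

```python
# Bush populations 1 to 3; columns are Black and Yellow, Green, Orange, Red.
BUSH = [
    [3, 8, 7, 2],
    [4, 10, 4, 2],
    [2, 9, 0, 9],
]
CRITICAL_VALUE = 16.81  # value quoted in the report


def chi_squared(rows):
    """Sum of (O - E)^2 / E over every cell.

    Assumption: every row has the same total, so the expected count
    in each cell is (column total) / (number of rows).
    """
    expected = [sum(column) / len(rows) for column in zip(*rows)]
    return sum(
        (observed - e) ** 2 / e
        for row in rows
        for observed, e in zip(row, expected)
    )


def degrees_of_freedom(rows):
    """(rows - 1) * (columns - 1) for a contingency table."""
    return (len(rows) - 1) * (len(rows[0]) - 1)


statistic = chi_squared(BUSH)
print(f"chi-squared = {statistic:.2f}, df = {degrees_of_freedom(BUSH)}")
if statistic > CRITICAL_VALUE:
    print("reject the null hypothesis")
else:
    print("fail to reject the null hypothesis")
```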
Independent replication – Grass habitat
| Population | Black and Yellow | Green | Orange | Red | Total (rows) |
|---|---|---|---|---|---|
| 1 | 2 | 9 | 0 | 9 | 20 |
| 2 | 7 | 1 | 6 | 6 | 20 |
| 3 | 8 | 8 | 2 | 2 | 20 |
| Total (columns) | 17 | 18 | 8 | 17 | 60 |
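Expected values
Black and Yellow: 17/3 = 5.67
Green: 18/3 = 6
Orange: 8/3 = 2.67
Red: 17/3 = 5.67

Black and Yellow
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | -3.67 | 13.46 | 2.38 |
| 2 | 1.33 | 1.76 | 0.31 |
| 3 | 2.33 | 5.42 | 0.96 |

Green
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | 3 | 9 | 1.5 |
| 2 | -5 | 25 | 4.17 |
| 3 | 2 | 4 | 0.67 |

Orange
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | -2.67 | 7.12 | 2.67 |
| 2 | 3.33 | 11.08 | 4.15 |
| 3 | -0.67 | 0.44 | 0.17 |

Red
| Population | O-E | (O-E)² | (O-E)²/E |
|---|---|---|---|
| 1 | 3.33 | 11.08 | 1.96 |
| 2 | 0.33 | 0.10 | 0.019 |
| 3 | -3.67 | 13.46 | 2.38 |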
Chi-squared value = 21.3
P value = 0.002
Critical value = 20.25
The chi-squared value (21.3) is larger than the critical value (20.25), so in this case we can reject the null hypothesis. With P = 0.002, a difference this large or larger would be expected only about 0.2% of the time if the null hypothesis were true, so there is a significant difference in the underlying frequencies of jelly babies in the grass habitat.
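As a cross-check, the grass statistic and its p value can be computed without statistical tables. This is a sketch under stated assumptions: the data are treated as a 3 × 4 table with 6 degrees of freedom, and the p value uses the closed form of the chi-squared upper tail that holds only for an even number of degrees of freedom. The function names are illustrative.

```python
import math

# Grass populations; columns are Black and Yellow, Green, Orange, Red.
GRASS = [
    [2, 9, 0, 9],
    [7, 1, 6, 6],
    [8, 8, 2, 2],
]


def chi_squared(rows):
    """Sum of (O - E)^2 / E, assuming all rows share the same total."""
    expected = [sum(column) / len(rows) for column in zip(*rows)]
    return sum(
        (observed - e) ** 2 / e
        for row in rows
        for observed, e in zip(row, expected)
    )


def p_value_even_df(statistic, df):
    """Upper-tail probability of a chi-squared variable with even df.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    Odd df needs a different method, so it is rejected here.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles even, positive df")
    half = statistic / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


df = (len(GRASS) - 1) * (len(GRASS[0]) - 1)
statistic = chi_squared(GRASS)
print(f"chi-squared = {statistic:.1f}, df = {df}")
print(f"p = {p_value_even_df(statistic, df):.4f}")
```

A statistics library would normally be used for the tail probability; the closed form is shown only to keep the example free of dependencies.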
Group members:
- Martha Foiani
- Karina Kaur
- Nisha Patel
- Sophia Walsh
- Madeleine Berry
Results
| Population | Habitat | Black | Green | Orange | Red | Yellow |
|---|---|---|---|---|---|---|
| 1 | Bush | 2 | 8 | 7 | 2 | 1 |
| 2 | Bush | 1 | 10 | 4 | 2 | 3 |
| 3 | Bush | 1 | 9 | 0 | 9 | 1 |
| 4 | Grass | 1 | 9 | 0 | 9 | 1 |
| 5 | Grass | 4 | 1 | 6 | 6 | 3 |
| 6 | Grass | 5 | 8 | 2 | 2 | 3 |